
    CLARIN: Norwegian and Nordic perspectives

    Proceedings of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources. Editors: Rickard Domeij, Kimmo Koskenniemi, Steven Krauwer, Bente Maegaard, Eiríkur Rögnvaldsson and Koenraad de Smedt. NEALT Proceedings Series, Vol. 5 (2009), 42-45. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9207

    Contagious "Corona" Compounding by Journalists in a CLARIN Newspaper Monitor Corpus

    Newspaper monitor corpora, which incorporate new materials on a regular basis, are particularly useful for tracking linguistic changes spurred by current developments. The COVID-19 pandemic prompted a case study in the Norwegian Newspaper Corpus. The corpus was mined for productive compounds with the stem “corona” and its alternative spelling “korona”, tracing their frequencies and dates of first occurrence during the first wave of the pandemic. The quantitative analysis monitored not only the daily volume and variation of such compounds, but also the dynamics of vocabulary growth and a change in their preferred spelling. The paper concludes with reflections on methodology and data sources.

    NLP for writing: What has changed?

    Proceedings of the Workshop on NLP for Reading and Writing – Resources, Algorithms and Tools (SLTC 2008). Editors: Rickard Domeij, Sofie Johansson Kokkinakis, Ola Knutsson and Sylvana Sofkova Hashemi. NEALT Proceedings Series, Vol. 3 (2009), 1-11. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/4116

    Preface

    Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler. NEALT Proceedings Series, Vol. 1 (2007), vii. © 2007 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/4476

    The META-NORD language reports

    Proceedings of the NODALIDA 2011 Workshop Visibility and Availability of LT Resources. Editors: Sjur Nørstebø Moshagen and Per Langgård. NEALT Proceedings Series, Vol. 13 (2011), 23–27. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/1697

    Theoretically Motivated Treebank Coverage

    Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit. University of Tartu, Tartu, 2007. ISBN 978-9985-4-0513-0 (online) ISBN 978-9985-4-0514-7 (CD-ROM) pp. 152-159

    Guidance for Citing Linguistic Data

    Linguistic data, in their many forms, are a valuable asset in research and education on language. From the predigital age, the earliest data to reach us are written records carved in stone, wooden sticks, or clay tablets, or penned on papyrus, parchment, and such. Early field linguists recorded samples obtained from informants and other sources in notebooks and card files. Speech was recorded on analog devices such as wax cylinders, phonograph records, and magnetic tape. Consultation of such materials as cited in studies was usually cumbersome, but their citation was often relatively straightforward. In the early digital age, materials were shipped on digital tape reels or CD-ROM, and citation consisted of references to physical media. Nowadays, most digital materials are made available online. This has clear implications for the practice of citation. Furthermore, the use of digital data in linguistics has greatly expanded in volume and variety. Primary data in the form of large digital corpora of text, audio, and video have become widely available and are often annotated at one or more linguistic levels. Some other types of digital data (in the wide sense of the term) relevant for research on language are lexicons, term banks, word nets, computational grammars, translation memories, survey results, quantitative data from experiments, and so on. Locating specific data that were used in studies would amount to looking for a needle in a haystack were it not for proper citation. Unfortunately, citation practices have not fully kept pace with new kinds of digital data and their distribution. In this chapter, we sometimes use the more general term resource when referring to different types of digital research products, including, for instance, language models and analyzers (e.g., grammars, parsers), annotation tools, statistical code associated with certain data sets, and other digital assets. Often, we mention data for simplicity, but most guidelines for data also hold for other resources. A data set is a set of data items that is distributed as a whole, but often we use data and data set interchangeably. The guidance given in this chapter is primarily targeted at authors of linguistic publications, while a secondary audience consists of academic publishers and resource providers such as repositories and archives.

    Acknowledgments

    Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler. NEALT Proceedings Series, Vol. 1 (2007), v. © 2007 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/4476

    Datamaskinell skrivestøtte

    [No abstract is available for this article]